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1.  INTRODUCTION 

During  (he  past  few  years,  considerable  effort  has  been  directed  towards  the  solution 
of  nonlinear  initial-boundary  value  problems  in  structural  mechanics.  The  finite  element 
method  is  one  of  the  more  popular  approaches  employed  to  reduce  continuum  problems  to 
nonlinear  algebraic,  discrete  problems.  Finite  element  methodology  is  well  documented  in 
the  literature  (e.g.,  see(23])  and  attention  here  is  devoted  to  procedures  which  may  be 
employed  to  solve  the  resulting  nonlinear  algebraic  problem.  The  general  class  of  problems 
of  interest  include  both  material  and  geometric  nonlinearities. 

A  brief  review  of  the  application  of  the  finite  element  to  problems  in  nonlinear  con¬ 
tinuum  mechanics  is  presented  in  Section  2.  The  result  is  a  large  set  of  nonlinear  algebraic 
equations  which  must  be  solved  for  the  state  variables  (e.g.,  displacements  and  stresses). 
Newton’s  method  is  commonly  employed  to  solve  the  nonlinear  equations.  In  Section  3  dis¬ 
cussion  of  Newton’s  method,  including  operation  count  estimates,  and  advantages  and 
disadvantages,  is  presented.  In  addition  modified  Newton’s  method  is  also  discussed  in  this 
section.  In  general,  Newton's  method  possesses  ideal  characteristics  of  local  convergence 
and  stability  but  is  normally  too  expensive  to  employ  for  solving  large  finite  element  prob¬ 
lems.  On  the  other  hand,  modified  Newton’s  method  has  desirable  operation  count  esti¬ 
mates  for  each  iteration  but  becomes  expensive  to  employ  because  of  its  low  convergence 
rate  characteristics.  Recently.  Matthies  and  Strang  (10]  have  suggested  that  quasi-Newton 
methods  be  used  to  solve  nonlinear  finite  element  problems.  In  Section  3.3  some  of  the 
quasi-Newton’s  methods  which  have  been  used  in  optimization  methods,  such  as  Broyden’s 
method  and  the  Broyden-Fletcer-Goldfarb-Shano  (BFGS)  method,  are  discussed  and  a  dis¬ 
cussion  of  the  advantages  and  disadvantages  for  these  methods  is  presented. 


The  use  of  iterative  equation  solvers  as  an  inner  loop  for  the  Newton’s  method  has 
been  suggested  before  and  recently  Eisenstai  completed  a  study  of  the  convergence  proper¬ 
ties  of  these  methods.  In  section  4  the  use  of  the  Lanczos  algorithm  for  iterative  solution  of 


linear  system  of  equations  is  described.  The  relation  of  the  Lanczos  algorithm  to  the  corru¬ 
gate  gradient  method  and  the  effect  of  preconditioning  on  the  algorithm  is  demonstrated. 
The  preconditioned  Lanczos  algorithm  is  used  to  solve  the  linear  system  of  equations  aris¬ 
ing  at  each  iterate  of  Newton’s  method.  This  results  in  a  technique  where  the  rate  of  con¬ 
vergence  of  the  method  can  be  varied  from  a  linear  to  a  quadratic  convergence  rate  by 
means  of  a  specified  tolerance.  In  section  5  a  detailed  description  of  the  Newton-Lanczos 
method  is  presented.  The  above  methods  are  tested  on  various  nonlinear  problems  in 
structural  mechanics  and  the  results  are  tabulated. 


I 
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2.  THE  FINITE  ELEMENT  METHOD  FOR  NONLINEAR  PROBLEMS 

Since  its  introduction,  the  Finite  Element  Method,  FEM,  has  become  one  of  the  most 
widely  used  techniques  for  solving  various  problems  in  continuum  mechanics.  The  FEM 
has  established  a  systematic  procedure  for  discretization  of  complex  structures  which  results 
in  a  large  set  of  nonlinear  equations. 

The  source  of  nonlinearities  are  of  two  types: 

(a)  material  nonlinearity 

(b)  geometrical  nonlinearity 

Addressed  here  are  problems  associated  with  both  nonlinear  constitutive  relations  and  non- 
linearities  due  to  large  displacements. 

In  the  following  section  a  summary  of  the  formulation  of  problems  in  continuum 
mechanics  is  presented.  The  resulting  equations  are  then  discretized  using  the  Galerkin 
method.  The  usual  convention,  that  implies  summation  with  respect  to  a  repeated  index  in 
a  term,  is  adopted. 

2.1.  Equations  of  Kinematics 

Consider  a  body  with  reference  configuration  at  time  (  -  0,  denoted  by  Bo,  that  is 
deformed  by  a  motion  to  a  current  configuration  B.  Figure  2.1  illustrates  this  motion.  Using 
a  common  rectangular  cartesian  coordinate  system,  the  Lagrangian  description  of  the 
motion  can  be  expressed  as 

x-  S  (X,t)  -  X  +  n(X,»)  (2.1.1) 

where  *  is  the  current  position  at  time  t  of  a  particle  with  position  vector  X  at  time  /  -  0 
and  a  is  the  displacement  the  particle  undergoes  in  this  motion. 

The  deformation  gradient  of  the  motion  is  defined  by 


frutn 


2.2.  MmtntiiB  Balance 

Equilibrium  conditions  between  internal  resisting  forces  and  externally  applied  forces 
must  be  satisfied,  at  least  in  the  weak  form,  whether  the  displacements  or  strains  are  large 
or  small.  The  momentum  balance  equation  is  given  by 

DivP  +  b  -  pah’  (2.2. 1. a) 

or 


Pnj  +  t>,  -  Poii< 


(2.2.  l.b) 
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where  Pi,  ere  the  components  of  the  Arst  Piola-Kirchhoff  (unsymmetric)  stress  tensor  P,  b , 
are  components  of  the  body  force  b,  Po  is  the  density  in  the  initial  configuration,  and  a 
superposed  dot,  ( ' ),  denotes  differentiation  with  respect  to  time. 

A  weak  form  of  the  momentum  balance  equations,  equivalent  to  virtual  work,  may  be  con¬ 
structed  by  multiplying  2.2. l.b  by  an  arbitrary  function  integrating  over  the  domain  of 
interest  1),  using  integration  by  parts  on  the  stress  term  and  setting  the  result  equal  to  zero. 
The  result  is 

| ( Wtjpu  -  W,b,  +  -  £  Wf.dr  -  0  (2  2.2) 

where  in  addition  to  previously  defined  quantities,  r,  is  a  specified  traction  and  r  2  is  the 
part  of  the  boundary  where  traction  is  specified.  For  equation  2.1.2  to  be  valid,  W,  must 
vanish  on  T|,  that  part  of  the  boundary  where  displacements  are  specified  (i.e.,  u,  —  S,  on 
F  i ).  The  traction  is  related  to  stress  through 

r,  -  N,P„  (2.2.3) 

where  TV/  are  direction  cosines  of  the  outward  normal  to  T  in  the  initial  configuration. 

The  components  of  the  first  Piola-Kirchhoff  stress  tensor,  is  replaced  by  Sy,  the  com¬ 
ponents  of  the  symmetric  second  Piola-Kirchhoff  stress  tensor,  S,  using  the  relation 

P„  -  S„FU  (2.2.4) 

where  Fj  are  the  components  of  the  deformation  gradient  matrix.  This  results  in 

J  ( VjSijFtf  -  Wfr  +  n  -  /  W'T^r  -  o  (2  2  5) 

n  Tj 

2.3.  CMStitntlve  Relation 

In  general  the  deformation  gradient  of  the  motion,  the  temperature  and  the  tempera¬ 
ture  gradient  determines  the  stress  state.  The  effects  of  inelastic  material  behavior  can  be 
modeled  through  internal  variables. 

For  isothermal  deformation  the  constitutive  equation  can  be  written  as 


where  'J'  is  the  Helmholz  free  energy  defined  at  (  X  ,  t  )  through  the  strain  tensor  and  the 
internal  variables 

♦  (X,/)  -  $(E,a)  (2.3.2) 

The  form  of  the  constitutive  relation  stated  above  is  currently  under  debate.  A  more 
comprehensive  discussion  can  be  found  in  reference  [14].  In  addition  to  the  previously 
defined  quantities  a  is  a  vector  of  internal  variables.  Further,  a  constitutive  relation  of  the 
form 

a  —  y(S,a)  (2.3.3) 

is  assumed  for  the  internal  variables.  It  is  sometimes  more  convenient  to  express  these 
relations  with  strain  rate  as  the  dependent  variable 

E  -  CS  +  A«  (2.3.4) 

where  C  is  the  instantaneous  elastic  compliance  tensor,  and  A  is  the  inelastic  compliance 
tensor  which  depends  on  the  stress  state,  the  internal  variables  and  the  velocity  gradient,  L, 
in  such  a  way  as  to  satisfy  objectivity 

A  -  A(S,cr,L)  (2.3.5) 

This  form  of  the  model  is  most  suitable  for  nonlinear  viscoelastic  or  elastic/viscoplastic 
materials.  The  constitutive  equations  are  both  nonlinear  and  rate  dependent.  Writting  2.3.4 
in  indicial  notation 

£/;  "  Cukl  Skl  +  Ay(*)d mo  (2.3.6) 

where  /?- 1,2,  -  -  -  ,A  and  Af  is  the  number  of  internal  variables. 

The  weak  form  of  this  equation  can  similarly  be  obtained  by  multiplying  2.3.6  by  an  arbi¬ 
trary  function  VIJs  and  integrating  over  the  domain  0.  This  leads  to 
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/  ViAEu  ~  CuklSkl  ~  A  */(*)£<*))</ ft  -  0  (2  3  7) 

n 

where  <»<#)  is  determined  form  equation  2.3.3  . 


2.4.  Finite  Element  Discretization 

In  a  finite  element  model,  the  domain  ft  is  divided  into  elements  ft,.  By  sampling 
the  solution  for  the  Aeld  variables,  such  as  displacements  or  stresses,  at  a  number  of  points 
in  the  elements,  known  as  nodes,  an  approximate  solution  can  be  constructed.  When  the 
stresses  are  known  explicitly  in  terms  of  the  strains  it  is  possible  to  choose  the  displace¬ 
ments  as  the  only  independent  variable  and  the  solution  to  the  nodal  parameter  can  be 
obtained  using  the  weak  form  of  the  equilibrium  equation,  2.2.S  .  However,  in  cases  when 
the  constitutive  relation  is  more  complex  it  is  simpler  to  use  a  weak  form  of  this  equation, 
2.3.7  ,  and  use  both  stresses  and  displacements  as  primary  variables. 

In  each  element  the  approximations  for  u,  and  Su  which  are  C°-continuous  and  piecewise 
constant  (  C  '-continuous  ),  respectively,  over  ft,,  are  in  the  form 

u  -  (2.4.1.a) 

and 

S  -  2>a(X)S,(f)  (2.4. l.b) 

B 


where  NA{\)  and  MB(X)  are  shape  functions  over  the  domain  ft,  satisfying  the  C°and 
C~'  continuity  requirements  cited  above.  The  arbitrary  weighting  functions  are  also 
approximated  using  these  shape  functions,  as 

W-£/V,(X)W„  (2.4.2..) 

and 


V  -  £AfB(X)VB 

B 


(2.4.2.b) 


where  W  A  and  V  B  are  arbitrary  nodal  parameter.  Using  a  temporal  finite  difference 
scheme,  such  as  Newmark’s  method  (13],  the  time  derivatives  in  equations  2.2.S  and  2.3.7 
can  be  removed. 

The  result  of  the  application  of  the  finite  element  method  and  the  time-stepping  pro¬ 
cedure  is  a  large  set  of  nonlinear  algebraic  equations  at  each  time  step,  tm,  with  nodal  dis¬ 
placements  and  stress  quantities  as  the  the  unknowns.  The  dependence  of  these  equations 
on  the  stress  quantities  can  be  removed  through  static  condensation  at  the  element  level 
resulting  in  nodal  displacements  as  the  unknown  parameters. 

These  equations  can  be  written  as 

f(x(rj]  -  0  (2.4.3) 

where  x  is  a  vector  of  the  state  variables,  which  consists  of  all  the  nodal  displacements, 
u/(/m)  ,  and  f  is  a  vector  function  of  x  obtained  by  application  of  the  Galerkin  method  to 
equations  2.2.S  and  2.3.7. 
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3.  SOLUTION  OF  THE  DISCRETE  NONLINEAR  EQUATIONS 

Employing  the  finite  element  method  and  a  time  stepping  procedure  to  discretize  the 
nonlinear  structural  mechanics  problem  leads  to  the  large  set  of  nonlinear  algebraic  equa¬ 
tion  given  by  2.4.3  .  Presented  here,  are  a  number  of  methods  normally  employed  for  the 
solution  of  this  equation. 

The  Taylor  expansion  for  this  ,  truncated  to  include  only  the  linear  terms  can  be  written  as 


where  f 


f(x  +  Ax)  =  f(x)  +  f  (x)Ax 
df 

— —  is  a  matrix  with  coefficients  /'y  given  by 


r 


>j 


dxj 


and  Ax  is  a  small  change  in  x. 

Almost  all  nonlinear  procedures  are  based  on  this  approximation. 


(3.0.1) 


3.1.  Newton’s  Method 

Using  equation  3.0.1  Newton’s  method  can  be  written  as 

f(x*+I)  ~  f(x*)  +  A (3.1.1) 

where  k  is  an  iteration  number,  d*  is  a  change  in  x*  called  the  step  direction,  and  A*  is  the 
Jacobian  or  tangent  matrix  of  f  defined  by 


8f(x) 

dx 


Newton's  method  consists  of  setting  3.1.1  equal  to  zero,  solving  for  d*  from 


and  setting 


A*d*  -  f(x*) 

x*+ i  -  x*  +d* 


(3.1.2) 


(3.1.2) 

(3.1.3) 


I 

L 
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Newton’s  method  requires  the  initial  guess  xo  to  be  in  a  domain  D,  called  the  domain  or 
attraction,  such  that  convergence  to  the  exact  solution,  x*  in  D,  of  equation  2.4.3  does 
occur.  Furthermore,  f  must  be  differentiable  in  D  and  the  tangent  matrix  must  be  non¬ 
singular  at  the  solution.  In  practice  it  is  often  desirable  to  modify  3.1.3  to 

**+i  “  x*  +  s*d*  0.1.4) 

where  sk  is  a  scalar  step  size  which  is  used  to  enhance  stability  of  the  algorithm.  The  value 
of  sk  is  determined  from  a  line  search  as  described  in  the  following  section. 

The  algorithm  for  implementing  Newton’s  method  consists  of  choosing  a  good  initial 
guess  x0  and  repeat  following  steps  for  k  -  0,1,2,  •  •  •  until  convergence  is  achieved: 

(1)  given  X*  compute  f(xA) 

(2)  compute  the  tangent  matrix  A* 

(3)  solve  A*d*  -  -f(x*)  for  ik 

(4)  compute  sk  from  a  line  search 

(5)  update  x*+,  -  x*  +  j*d* 

(6)  test  for  convergence  and  terminate  if  converged 

There  are  several  procedures  which  may  be  used  to  terminate  the  iteration. 

These  include: 

(a)  |f(x*)H  <  €|max||f(x,)|i 

f 

(b)  Kit  ||  <  <2max |(d,|| 

I 

(c)  |i*  f(x*)|  <  «3max|d,  f(x,)| 

# 

where  c,  are  small  positive  constants.  In  this  work  method  (a)  was  used.  Method  (c)  is  only 
applicable  for  symmetric  positive  definite  Jacobians. 
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3.1.1.  Line  Search  for  Newton’s  Method 

In  the  above  presentation  a  step  size,  s*,  has  been  included  to  improve  the  stability  of 
Newton’s  method.  The  magnitude  of  sk  is  obtained  in  such  a  way  as  to  minimize  a  norm  of 
the  residual,  f(x*).  A  common  procedure  used,  is  to  solve  the  following  nonlinear  scalar 
equation  for  s* 

Gisk)  -  Akfixk+skik)  -  0  (3.1.5) 

It  is  important  to  note  that  equation  3.1.5  need  not  be  solved  accurately  since  x*+i  is  itself 
only  an  approximation  and  also,  in  general,  the  additional  function  evaluations  are  costly  in 
finite  element  analyses. 

In  fact,  when  possible,  the  line  search  should  be  omitted  when  xk  is  close  to  a  solution. 

3.1.2.  Operation  Counts  for  Newton's  Method 

For  purposes  of  subsequent  cost  comparisons,  operation  counts  for  Newton’s  method 
are  estimated  to  indicate  the  relative  efforts  needed  in  each  step.  For  an  n  -dimensional  x, 
the  operation  count  estimates  are. 


(a) 

computation  of  f(x*) 

0(/i) 

(b) 

computation  of  A* 

Oin2) 

(c) 

direct  solution  of  equations 

triangular  decomposition  of  A  * 

OCn3) 

forward  reduction/backsubstitution 

Oin 

(d) 

line  search 

Oin) 

(e) 

update  of  solution 

Oin) 

(f) 

convergence  test 

Oin ) 

While  order  of  magnitude  estimates  are  given,  substantial  differences  in  real  effort  exist 
between,  for  example  (a)  and  (d).  The  majority  of  effort  is  associated  with  steps  (a)  to  (d), 
and  only  comparisons  between  these  steps  are  meaningful.  There  is  an  order  of  magnitude 
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increase  in  step  (b)  over  step  (a)  or  in  step  (c)  over  step  (b).  It  is  important  to  note  that 
these  estimates  are  somewhat  problem  dependent,  for  example  step  (a)  can  cost  as  much  as 
step  (b).  Consequently,  considerable  efficiency  can  be  achieved  by  eliminating  or  reducing 
the  number  of  expensive  steps  required. 

3.1.3.  Convergence  of  Newton’s  Method 

The  rate  of  convergence  of  an  algorithm  will  determine  how  fast  the  iteration  x* 
approaches  a  solution  x’  An  acceptable  algorithm  must  be  at  least  linearly  convergent  [3]; 
that  is,  given  a  solution  x*,  then 

||x*+,  -  x *||  <  o||x*  -  x *||  (3.1.6) 

where  a  is  a  positive  constant  less  than  unity.  Although  3.1.6  ensures  that  the  error  norm  is 
reduced  by  the  factor  a  in  each  iteration,  to  be  competitive  it  is  generally  acknowledged 
that  an  algorithm  must  have  better  than  linear  convergence.  When  Newton’s  method  has 
continuously  differentiable  f  and  a  solution  x*  in  D,  then  the  error  norm  satisfies  the 
stronger  condition 

||x*+1  -  x’||  <  a||x*  -  x‘||«  (3.1.7) 

where  q  is  a  positive  scalar  such  that  1  <  q  <  2  .  This  is  called  super-linear  convergence. 
In  addition,  for  cases  where  f  is  twice  differentiable  in  the  neighborhood  of  x*,  with  non- 
singular  Jacobian,  Newton's  method  is  quadraticaily  convergent  with  q  —  2. 

For  finite  elment  applications,  Newton’s  method  will  almost  always  have  at  least  super- 
linear  convergence,  and  most  problems  are  such  that  quadratic  convergence  will  be 
achieved. 

3.1.4.  Advantages  and  Disadvantages  of  Newton’s  Method 


Newton's  method  has  at  least  two  very  desirable  properties: 
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(i)  Any  x*  in  D  results  in  an  x*^  in  D;  consequently,  the  method  is  stable  and  conver¬ 
gent  once  an  iterate  is  in  D. 

(ii)  The  method  possesses  at  least  super-linear  convergence  and  often  quadratic  conver¬ 
gence  (31. 

On  the  negative  side,  we  note  that: 

(i)  If  D  is  small,  then  the  initial  approximation  may  have  to  be  very  close  to  x*  to  get 
inside  D.  We  do  not  know  when  xo  is  in  D. 

(ii)  The  triangular  decomposition  of  the  Jacobian  matrix  is  very  costly  in  large  finite  ele¬ 
ment  problems. 

The  requirement  of  a  good  initial  guess  may  be  avoided  in  part  by  using  line  searches 
and,  for  quasi-static  problems,  an  evolution  of  the  load  application  [23].  In  the  sequel  we 
address  the  possibility  of  reducing  computational  factorizations  of  the  tangent  matrix. 

3.2.  Modified  Newton’s  Method 

For  large  systems  of  equations,  the  main  cost  in  Newton’s  method  is  the  triangular 
decomposition  of  the  Jacobian  matrix.  The  use  of  a  previously  factored  Jacobian  is  often 
advocated  in  place  of  the  current  tangent  matrix.  The  algorithm  is  given  as,  for 
k-0,1,2,  •  repeat 

(1)  given  x*  compute  f(x*) 

(2)  solve  A,d*  -  -f(x*),  where  A,  is  the  tangent  matrix  at  step  i  <  k 

(3)  compute  sk  from  a  line  search 

(4)  update  x*+1  -  x*  +  s*d* 

(5)  test  for  convergence  and  terminate  if  converged 

Comparing  this  algorithm  with  that  of  Newton's  method,  we  observe  that  the  0(n*)  opera¬ 
tions  to  compute  the  Jacobian  matrix  and  the  Oln1)  operations  to  compute  the  triangular 
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decomposition  are  avoided.  The  algorithm  still  requires  OCn2)  operations  to  perform  a 
resolution  using  the  triangular  factors  of  A,.  However,  these  savings  are  achieved  at  the 
expense  of  the  convergence  rate  since  modified  Newton's  method  only  converges  linearly, 
as  given  by  equation  3.1.6.  There  exists  some  number  of  iterations  p  such  that  computation 
of  a  new  Jacobian  matrix  will  make  the  modified  Newton’s  method  more  efficient.  Unfor¬ 
tunately,  the  value  of  p  depends  on  the  degree  of  nonlinearity  of  the  problem  and  cannot 
be  estimated  easily. 

3.3.  Quasi-Newton  Methods 

Quasi-Newton  methods  are  a  generalization  of  the  one-dimensional  secant  method  to 
the  n -dimensional  problem  (e.g.,see  [11}  and  [12]).  In  the  secant  method,  an  approxima¬ 
tion  to  the  tangent  matrix.  A*,  is  used  at  each  iteration.  This  concept  is  applied  in  multi¬ 
dimensions  and  a  simple  updating  is  deduced  to  compute  the  new  A*  from  the  previous 
value  of  A*_|.  The  staring  matrix  Ao  is  normally  taken  as  f  (xo),  however,  other  choices  are 
possible  such  as  the  finite  difference  approximations  [3].  The  convergence  rate  for  all  prac¬ 
tical  quasi-Newton  methods  is  super-linear,  and  the  number  of  numerical  operations  for 
each  iteration  is  Ofn1). 

To  deduce  the  equation  for  the  updating  of  the  Jacobian,  known  as  the  secant  equa¬ 
tion,  a  linear  backward  Taylor  formula  at  step  k  is  used. 

f(x*)  =  f(x*_|)  -  f(x*)(x*  -  x*_|)  (3.3.1) 

This  equation  can  be  rearranged  to  obtain  the  approximate  Jacobian  A  * 

A*d*_i  *“  b*_i  (3.3.2) 

where 

d*-,  -  x*  -  x*_,  (3.3.3) 


and 
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b*_,  -  f(x*)  -  f(**_,)  (3.3.4) 

The  values  in  3.3.3  and  3.3.4  must  be  computed  to  perform  the  step 

A*d*  -  -f(xk)  (3.3.5) 

Equation  3.3.2  is  called  the  quasi-Newton  equation,  which  must  be  satisfied  by  A*.  In  a  one 
dimensional  problem,  this  equation  is  sufficient  to  determine  Ak  completely,  however,  for 
multi-dimensional  problems  additional  constraints  are  required  to  obtain  Ak.  The  construc¬ 
tion  of  the  approximate  Jacobian  depends,  to  a  large  extend,  on  the  choices  for  these  con¬ 
straints.  The  work  of  Broyden  and  Broyden-Fletcher-Goldferb-Shano,  BFGS,  are  but  two  of 
the  many  possibilities  for  constructing  Ak. 


3.3.1.  Broyden’s  Method 

In  1965  Broyden  proposed  a  method  for  the  approximate  specification  of  the  tangent 
matrix  by  a  simple  update  of  the  previous  value.  Broyden  assumed  that  Ak  does  not  differ 
from  Ak-i  when  acting  on  any  vector  orthogonal  to  d*_(.  accordingly 

A*z  -  At_|i  (3.3.6) 

for 

*rd*-i  -  0 


These  equations  provide  an  update  relation  for  Ak  and,  together  with  equation  3.3.2,  allows 
A*  to  be  determined  uniquely.  Thus,  Broyden's  method  is  given  by 


A k  ”  A*_|  + 


(b*-i  -  A*_td*-|)d*l| 


(3.3.7) 


Post-multiplying  3.3.7  by  d*-i  and  z  results  in  equations  3.3.6  and  3.3.2,  respectively, 
thus  demonstrating  the  applicability  of  3.3.7  .  This  equation  can  be  written  in  the  computa¬ 
tionally  more  attractive  form,  using  equation  3.3.4 


17 


A*  ■  A*_|  +  -  ||^*|  '  (3.3.8) 

Broyden's  method  is  super-linearly  convergent  (3).  Given  an  initial  Xo  and  Ao  and 
using  the  above  equations  the  quasi-Newton  step  can  be  performed.  The  algorithm  is  ident¬ 
ical  to  that  for  Newton’s  method,  except  the  computation  of  the  Jacobian  is  replaced  by 
equation  3.3.8  .  The  factorization  of  A  *  at  each  step  is  still  required;  consequently,  it  is  best 
to  directly  update  the  inverse  of  A*-t  to  obtain  the  inverse  of  A*.  This  will  eliminate  the 
need  to  compute  the  triangular  decomposition  of  A*  and  therefore  will  result  in  an  algo¬ 
rithm  with  Of/i2)  operations  in  each  step. 


3.3.2.  Cempatatioa  of  the  Inverse  Matrix 

The  inverse  of  matrices  defined  similar  to  3.3.8  may  be  written  as 

(A  +  vwO-1  —  A"'  — ^A"VwrA_i  (3.3.9) 

«r 

where 

<r  —  I  +  wrA-lv 


If  we  let  H*  be  the  inverse  of  A*,  Broyden's  method  for  the  update  of  the  inverse  becomes 

(d*-i  ~  H*_ib*-i)d/liH*-i 


H* 


8*r-iH»_|b*_i 


(3.3.10) 


provided  d/-|H*_|b*_i  is  nonzero.  Broyden’s  method  may  be  implemented  as 

d*--H*f(x*)  (3.3.11) 

3.3.3.  Convergence  of  Qaasl-Newton  Method 

Convergence  properties  of  quasi-Newton  method  are  discussed  by  Dennis  and  More 
in  (31.  In  this  study  they  restate  an  earlier  result  that  an  iterative  method  is  super-linearly 
convergent  if 
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H lA *  -  f(«*)](*ik+i  -  x*)|| 

a — 1«»,-  >.i — 


(3.3.12) 


If  A*  converges  to  f(x*),  as  for  Newton’s  method  where  A*  equals  f  (x*),  then  conver¬ 
gence  is  always  super-linear.  However,  the  important  result  of  3.3.12  is  that  when  A*  only 
converges  to  f(x*>  in  the  direction  4*.  convergence  is  also  super-linear.  Both  the  Broyden 
and  the  BFGS  quasi-Newton  method  satisfy  3.3.12  for  continuously  differentiable  f  and 
thus  are  super-linearly  convergent. 


3.3.4.  Methods  for  Symmetric  Positive  Definite  Jacobians 

Brodlie,  Gourlay  and  Greenstadt  have  shown  that  certain  rank-one  and  rank-two 
corrections  to  symmetric  positive  definite  matrices  may  be  conveniently  expressed  in  the 
product  form  (1] 

H*  -  (I  +  +  v*w *0  (3.3.13) 


This  form  has  the  important  property  that  successive  inverse  tangent  matrices  remain  sym¬ 
metric.  In  [10]  the  algorithm  is  related  to  the  BFGS  algorithm,  which  gives 


dr 


i*-i 


and 


(3.3.14) 


** 


f(x*_t) 


d*-if(y*>  | 


f(**) 


(3.3.15) 


The  computational  steps  for  implementing  the  BFGS  method  are  very  straight-  for¬ 
ward  and  consist  of  solving  3.3.11  with  following  steps: 

(1)  for  each  k  compute  f(x*)  and  r*  -  -  (I  +  v*w 

(2)  solve  H*i|U*  -  i* 

(3)  compute  d*  -  (1  +  w*v*)u* 
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(4)  if  required,  compute  a  line  search  for  sk 

(5)  update  -  **  +  s*d* 

(6)  check  convergence 

It  should  be  noted  that  H*_j  may  result  from  a  previous  BFGS  update.  Thus,  step  (1)  and 
(3)  may  require  several  updates  before  the  resolve  step  is  performed.  Furthermore,  this 
form  of  the  algorithm  is  strictly  limited  to  positive  definite  tangent  matrices.  Indefinite 
matrices  resulting  from  Lagrange  multiplier  constraint,  such  as  contact  problems,  can  not 
be  considered  by  present  implementation  of  BFGS  and  therefore  an  alternative  implementa¬ 
tion  is  required. 


4.  LANCZOS  ALGORITHM  FOR  SOLUTION  OF  THE  LINEARIZED  PROBLEM 


The  discretization  of  nonlinear  structural  mechanics  problems  and  linearization  of  the 
resulting  nonlinear  algebraic  equations  for  application  of  a  Newton  type  methods  normally 
will  lead  to  a  symmetric  system  of  linear  equations.  These  equations  may  be  written  as 

Ax  -  b  (4.0.1) 

Presently,  most  nonlinear  procedures  employ  a  direct  method  such  as  Gaussian  elimi¬ 
nation  for  the  solution  of  this  system  of  equations.  In  this  section  an  alternative  solution 
technique  is  presented. 

In  most  applications  A  will  be  sparse,  and  an  elegant  way  to  make  use  of  sparsity  is  to 
employ  A  solely  as  a  linear  operator  which  computes  the  product  Av  for  a  given  vector  v. 
This  has  the  added  advantage  that  A  need  not  be  known  explicitly  but  only  a  means  of 
computing  the  matrix  vector  product  is  required. 

For  years,  iterative  methods  have  been  used  for  the  solution  of  large  systems  of  equa¬ 
tions.  The  Conjugate  Gradient  method  is  one  such  technique  introduced  by  Hestenes  and 
Stiefel  (5].  This  method,  in  theory,  requires  at  most  n  steps  to  obtain  the  exact  solution  of 
4.0  1  and  was  accepted  because  of  this  termination  property.  In  the  same  year  Lanczos  pub¬ 
lished  his  method  of  Minimized  Iteration  which  was  initially  introduced  for  computing  the 
eigen  pairs  of  a  large  symmetric  matrix.  Lanczcs  Householder  (6]  pointed  out  the  inti¬ 
mate  connection  between  the  two  approaches. 

These  methods  have  several  attractive  features  in  common.  There  is  no  need  for  A  to  have 
further  special  properties,  such  as  banded  form,  no  acceleration  parameters  have  to  be 
estimated,  and  the  storage  requirements  are  only  a  few  n -vectors  in  addition  to  the  storage 
needs  of  A. 

In  finite  element  applications  the  matrix  vector  multiplication  can  be  performed  at  the 
element  level.  This  will  result  in  considerable  saving  in  storage  at  the  expense  of  some 
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additional  multiplications. 

4.1.  The  Lanczos  Algorithm 

For  certain  applications  of  the  finite  element  method,  especially  in  nonlinear  prob¬ 
lems,  it  is  usual  ;o  have  on  hand  an  initial  approximation  xa  to  the  true  solution  of  4.0.1  . 
The  problem  is  now  to  find  a  correction  xc  to  be  added  to  xa.  Then 

Axf-r0  (4.1.1) 

where 

ro-b-Ax*  (4.1.2) 

Lanczos  algorithm  may  be  described  very  simply  as  a  process  of  constructing  the  weak  form 
of  equation  4.0.1  from  a  very  special  subspace.  The  subspace  under  consideration  is  gen¬ 
erated  from  the  set  of  j  vectors  (  r0 ,  Ar0 ,  •  •  •  ,  A-^ro ),  known  to  mathematicians  as 
the  Krylov  subspace  (18).  To  construct  the  weak  form  it  would  be  simpler  if  an  orthonor¬ 
mal  set  of  vectors,  say  (qi.q*  -  -  -  ,qy),  were  available.  This  can  be  achieved  by  applying 
Gram-Schmidt  orthogonalization  to  the  Krylov  vectors.  Initially,  this  appears  to  be  an 
expensive  way  of  obtaining  an  orthonormal  base  vectors,  however  this  process  wa  be 
simplified  when  the  following  two  facts  are  used: 

(i)  The  use  of  Aq,  and  Ayro,  for  orthogonalization  against  the  previous  q  vectors  and 
normalization  of  the  resulting  vector,  leads  to  the  same  vector  ®y+|. 

(ii)  The  vector  Aq,  is  orthogonal  to  qi,q>  •  •  •  ,qy_2- 

These  results  are  demonstrated  in  Appendix  A.  A  more  complete  explanation  is  presented 
in  (18).  Therefor  it  is  sufficient  to  orthogonalize  Aqy  against  qy_j  and  qy  to  obtain  the  next 
orthogonal  vector. 

t,  =  0y+iqy+i  -  Aqy  -  a  (4.1.3) 
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where  at  -  q/Aq,  and  0,  -  q^|Aqy.  It  is  important  to  note  that  the  vectors 
(qi.qj,  '  .q,-2>  are  not  needed  in  equation  4.1.3  .  This  defines  one  step  of  the  Lanczos 
algorithm.  The  normalization  of  ry  results  in  qy+1.  It  is  easy  to  show,  by  looking  at  qy>iry, 
that  0,+,  -  ||r,||. 

The  special  choice  of  the  base  vectors  for  the  subspace  has  an  additional  advantage. 
The  projection  of  A  onto  this  subspace  is  a  tridiagonal  matrix,  Ty. 

«]  02 
02  012  03 

Ty-Q/AQ,  -  '  (4.1.4) 

«y-i0y 

0y  aJ 

where  the  q  vectors  form  the  columns  of  the  matrix  Qy,  Qy  =  (qi,q2,  •  •  •  ,q7). 

This  fact  was  realized  soon  after  Lanczos  introduced  his  method  and  the  algorithm  was  put 
to  use  as  a  process  for  the  orthogonal  transformation  of  a  matrix  to  tridiagonal  form. 
Despite  its  additional  attractions,  the  Lanczos  process  gave  way  to  Givens'  method  in  I9S4 
and  later  to  Householder's  method  in  1958. 

The  relationships  that  define  the  Lanczos  algorithm  can  now  be  summarized  in  the 
following  three  equations. 


Q/Q,  -  h 

(4.1.5.a) 

AQ )  ~  Q/Tj  -  r j«l 

(4.1.5.b) 

Q/r,  -  0 

(4.1.5.C) 

where  e,  is  the  y'-th  column  of  the  jxj  identity  matrix  1 1. 

Setting  q0  -  0  and  using  r0  as  the  starting  vector,  the  Lanczos  algorithm  can  then  be 
described  as 


Given  ro.  set  0,  -  ||r0|| 
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Fory  -  1,2,  ••  •  repeat 

(2)  Uj  —  Kqj 

(3)  tj  —  uj  -  PjUj-i 

(4)  etj  —  qjtj 

(5)  r>  —  r,  -  aj*j 

(6)  0,+1  -  || r y  || 

At  the  end  or  a  simple  Lanczos  step  the  newly  computed  Lanczos  vector  is  written 
into  secondary  storage. 

4.2.  The  Weak  Form 

A  weak  form  of  equation  4.0.1  can  now  be  constructed  using  equation  4.1.S  .  The 
approximation  to  the  solution  from  the  subspace  can  be  written  as 

xj  -  Q,f,  (4.2.1) 

where  f,  is  the  vector  in  the  subspace  defining  the  approximate  solution.  The  Mh 
coefficient  of  tj  and  its  corresponding  q,  are  much  like  the  nodal  parameter  and  its  asso¬ 
ciated  shape  function  in  the  finite  element  discretization. 

Equations  4.1.1  together  with  equation  4.2.1  results  in 

\Qjfj  -  r0  (4.2.2) 

This  is  an  overdetermined  system  of  equation.  However  a  solution  can  be  obtained  using  a 
procedure  similar  to  the  method  of  weighted  residqals,  and  an  approximation  can  be 
obtained  by  premultiplying  this  with  W/  The  weighting  vectors  Wj  can  be  chosen  in  such 
manners  as  to  reduce  the  residual  of  the  quantities  of  interest. 


24 


T 
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W/AQyf,  -  W  ji0 


A  Galarkin  type  method  results  when  the  Lanczos  vectors  are  chosen  as  the  weighting  vec- 


Q/AQ;f;  -  Q/r0 
Tjtj  -  pfiP 


(4.2.3) 


Here  ejj)  is  the  first  column  of  the  identity  matrix  and  the  above  is  derived  using  the  fact 
that  r0  -  qi/f  |.  The  above  equations  possess  the  orthogonal  characteristics  of  the  residual 
vector  peculiar  to  Galarkin  methods,  that  is  the  residual  vector  is  kept  orthogonal  to  the 
trial  vectors  and  therefore  is  ’minimized”.  In  appendix  B  equation  4.2.3  is  also  derived  from 
an  energy  consideration.  This  orthogonality  property  can  be  seen  in  the  form  of  a  mono¬ 
tonic  reduction  in  the  potential  energy  of  the  system  which  is  a  characteristic  of  both  the 
Lanczos  method  and  conjugate  gradient  algorithm.  In  Figure  4.1  this  is  shown  in  terms  of  a 
simple  beam  problem. 


4.3.  Termination  Criteria 

In  the  previous  section  the  Lanczos  algorithm  was  viewed  as  a  method  for  construct¬ 
ing  the  weak  form  of  the  linear  set  of  equations.  The  form  of  the  algorithm  suggests  that 
after  every  Lanczos  step  the  number  of  q  vectors  increase  by  one.  In  exact  arithmetic  the 
algorithm  can  be  terminated  after  n  steps  and  the  tridiagonal  matrix  contains  all  the  infor¬ 
mation  needed  to  solve  the  problem  exactly.  However,  it  is  often  sufficient  to  stop  the 
iteration  at  some  step  m  <  n,  when  the  solution  is  close  enough  to  the  exact  solution. 
Using  the  weak  form,  equation  4.2.3,  the  solution  vector 


f«i  “  (<^1,02,  '  '  T 


(4.3.1) 


can  be  obtained  by  solving  the  equations.  The  approximate  solution  xm  can  then  be  con¬ 
structed  directly  from  equation  4.2.1  .  The  residual  vector  is 


i 
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**  =  Al*  -  to 

-  AQ.f*  -  r0  (4.3.2) 

Setting  j  -  1  in  equation  4.1.3  the  following  can  be  derived 

ro-Pfli 

-  Q*  (*{m)Pl)  (4.3.3) 

This  together  with  4.3.2  and  4.2.3  results  in 

f*  -  AQ*f*  -  Q*T*f* 

-  (A Q*  -  Q*T*)f* 

-  (rmej)  tm 

-  rm4>m  (4.3.4) 

Therefore  the  residual  vector  is  a  scalar  multiple  of  the  rm  vector  computed  at  the  m-th 
Lanczos  step.  Moreover  the  residual  norm  pm  is  given  by 

Pm  =  II  *  n  II 

-  i*j  ii**ii 

-0*+.k*l  (4.3.5) 

Thus  by  computing  the  bottom  element  of  the  solution  vector  to  the  tridiagonal  system  the 
residual  norm  can  be  computed  using  a  simple  multiplication,  noting  that  0M+|  is  computed 
at  the  m-th  Lanczos  step.  In  fact  it  is  not  even  necessary  to  compute  the  solution  to  the  tri¬ 
diagonal  system,  f*  to  obtain  4*-  A  recurrence  for  updating  from  information  at  the 
previous  step  can  be  derived  by  considering  a  QR  factorization  of  T*.  Consider  first  a  fac¬ 
torized  form  of  the  tridiagonal  matrix  at  step  j  —  1 

T,-.  - 


(4.3.6) 
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where  Ry_i  is  (he  product  of  all  the  rotation  matrices  up  to  and  including  step  j  -  l  that 
transforms  Tj-\  to  the  upper  triangular  matrix  Uy_|.  Then  the  system  of  equations  4.2.3, 
for  step  j  ~  1,  can  now  be  written  as 

-  0|Rj-1ei(/-M  (4.3.7) 

The  last  equation  of  the  above  can  be  written  as 

"  y>-i  (4.3.8) 


Here  tilde  denotes  all  quantities  that  were  modified  by  the  all  the  previous  rotations  and 
y j-\  is  the  bottom  element  that  is  created  by  these  rotations.  After  an  additional  Lanczos 
step  the  last  two  equations  of  the  new  system  is 


otj-\  Pj  4>j- 1 

yj- » 

Pj  <t>j 

0 

(4.3.9) 


these  equations  are  transformed  again  into  an  upper  triangular  form  by  premultiplication  of 
equation  4.3.9  by  the  plane  rotation  matrix 


CJ 

5J 


where  cy  -  cos  ^  and  sy  -  sin  i|»y.  Here  <b /  is  the  rotation  angle.  The  result  of  this  matrix 
multiplication  is 


Cjdj-i  -  SjPj 

CjPj  -  SjOj 

6j- 1 

cjYj- i 

SjPj-  1  +  CjPj 

SjPj  +  cJaJ 

s)Yj-\ 

(4.3.10) 


The  value  of  the  rotation  angle  is  determined  by  setting  the  term  in  the  lower  triangular 
part  equal  to  zero.  Thus 


tan«|»,  - 


*j- 1 


(4.3.11) 


and  Anally  *y  can  be  obtained  by  performing  the  Arst  step  of  the  becksubstitution  process. 


+j 


SJYJ- 1 

(sjPj  +  Cja  j) 


(4.3.12) 


It  should  be  pointed  out  that  the  rotation  at  this  step  effects  the  next  off-diagonal  term  in 
the  tridiagonal  matrix  and  the  effect  is  only  on  the  super-diagonal  term  which  explains  the 
Pj  term  in  equation  4.3.9  .  The  Anal  form  of  the  updating  algorithm  can  be  arranged  as  fol¬ 
lows 


tan*;  “ 


Pj 

«j- 1 


-  sin* jjij  +  cos* ya, 


1  “  COS*;^;+l 


Vj  -  sin*yyy_i 


+J~ 


ii 


This  is  a  very  inexpensive  step  and  therefore  an  attractive  form  of  updating  the  bottom  ele¬ 
ment  of  the  solution  vector.  The  residual  norm,  p;,  can  then  be  computed  from  equation 

4.3.5  . 

Contrary  to  the  Conjugate  Gradient  method  it  is  not  necessary  to  update  an  approxi¬ 
mate  solution  vector  at  each  step,  but  only  the  residual  norm  has  to  be  monitored.  Once  p y 
falls  bellow  the  speciAed  tolerance  the  Lanczos  algorithm  is  terminated  and  the  small  m 
degree  tridiagonal  system  is  solved.  It  is  very  important  to  note  that  if  A  is  indeAnite  then 
Ty  may  also  be  indeAnite  and  care  must  be  taken  in  solving  the  small  system.  For  this  rea¬ 
son  a  stable  equation  solver  such  as  the  one  used  in  the  updating  procedure  for  the  residual 
norm  should  be  used. 

Finally  the  approximate  solution  \m  has  to  be  computed.  It  is  here  that  the  Lanczos 
vectors  are  needed  again  to  construct  xm.  In  the  present  implementation  the  Lanczos  vec- 
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tors  are  put  onto  secondary  storage  as  they  are  computed  and  at  the  end  of  the  algorithm 
they  are  brought  back  one  by  one  to  form  xm. 


4.4.  Derivation  of  Conjugate  Gradients  Method  from  Lanczos  Algorithm 

The  conjugate  gradients  method  may  be  developed  directly  from  the  Lanczos  process. 
This  will  demonstrate  the  limitation  of  this  method.  The  aim  is  to  obtain  a  procedure  for 
updating  of  the  solution  vector  x ,  without  retaining  all  the  Lanczos  vectors. 

Consider  first  the  case  where  A  is  positive  definite,  then  the  resulting  tridiagonal 
matrix  will  also  be  positive  definite  and  hence  its  cholesky  factorization  exists. 

T j  -  L,L/  (4.4.1) 


Here  Ly  is  a  lower  bidiagonal  matrix  with  positive  diagonal  terms. 

A 

Ly 

*'-»8 
6J 


[M2  *2 
M3  . 


(4.4.2) 


The  main  obstacle  to  updating  Xy  is  the  fact  that  the  solution  vector  fy  to  equation 
4.2.3  changes  fully  after  each  Lanczos  step  and  therefor  Qyfy  can  not  be  simply  accumu¬ 
lated  recursively.  This  difficulty  can  be  overcome  by  defining 

tj  -  Lyrfy  (4.4.3) 

and 

Py  -  QyLy"r  (4.4.4) 

The  vectors  in  Py  are  the  conjugate  vectors.  Then  the  approximation  Xy  becomes 

Xy-Pygy  (4.4.6) 


Further,  the  weak  form  equation  now  becomes 


It  is  apparent  from  the  bidiagonal  structure  of  L j,  equation  4.4.2,  the  gj  can  be  updated 
very  easily  since  the  solution  of  equation  4.4.7  is  obtained  by  a  simple  forward  reduction. 
Therefor  gJ+1  can  be  written  in  terms  of  gy  as 


where 


ly+« 


ij 

Vj+l 


(4.4.8) 


Vj+ 1  “ 


(4.4.9) 


Thus  the  approximation  xJ+i  can  obtained  from  %j  using  equation  4.4.6  in  arranged  in  the 
following  form 

Xy+,  -  Xy  +  TJy+|P7+l  (4.4.10) 


and  finally  the  p-vectors  can  be  obtained  from  their  definition.  Thus 

Py+l  “  ”  My+iPy) 

1 


(4.4.11) 


This  is  equivalent  to  the  method  of  conjugate  gradients.  The  approach  here  is  compu¬ 
tationally  a  little  different,  but  the  role  of  the  cholesky  decomposition  becomes  apparent, 
with  the  necessary  requirement  that  A  be  positive  definite  to  ensure  numerical  stability. 

If  A  is  an  indefinite  symmetric  matrix,  the  algorithm  may  still  be  carried  out,  with  some 
success,  but  due  to  instability  it  can  no  longer  be  numerically  reliable. 


4.5.  Selective  Orthogonalization 

A  detailed  account  of  the  behaviour  of  the  Lanczos  algorithm  in  the  presence  of 
roundoff  and  of  selective  orthogonalization  is  available  in  Parlett  [18].  Therefore  in  this 
section  only  the  basic  facts  about  selective  orthogonalization  will  be  presented. 


J 


30 


Let  <  be  the  smallest  number  in  the  computer  such  that  !-»-«>  1.  This  is  known  as 
the  unit  roundoff  error.  The  basic  equations  4.1. 5.a  and  4.1. S.c  are  now  perturbed  by 
roundoff.  Although  for  each  j  4.1. 5. b  is  only  slightly  perturbed,  the  relations  4.1.S.a  and 
4.1.  S.c  completely  fail  after  a  certain  number  of  steps  depending  on  c  and  on  A.  The  Lanc- 
zos  vectors,  which  are  orthogonal  in  exact  arithmetic,  not  only  loose  their  orthogonality, 
but  even  become  linearly  dependent. 

For  a  long  time,  as  a  remedy  against  this  loss  of  orthogonality,  it  was  suggested  to 
reorthogonalize  each  new  q,  against  all  previous  Lanczos  vectors,  which  is  of  course  very 
expensive.  The  results  of  Paige  [15]  gave  some  better  insight  in  the  way  orthogonality  is 
lost,  and  provide  the  theoretical  basis  for  selective  orthogonalization. 

In  order  to  explain  Paige’s  results  it  is  necessary  to  introduce  certain  quantities  which 
are  not  of  direct  interest  when  solving  a  linear  system  of  equations.  Let  0,<J)  be  the  eigen¬ 
values  and  be  the  corresponding  normalized  eigenvectors  of  Tj 

/-  1 . j.  (4.5.1) 

From  these  one  can  compute  the  Ritz  vectors  y,^’  by 

y^-Q^  (4.5.2) 

These  quantities  change  at  each  Lanczos  step.  In  subsequent  discussions  where  there  is  no 
confusion  possible  the  superscripts  will  be  dropped.  The  pairs  (P,^>,y^> )  /— 1,...J  are 
approximate  eigenpairs  of  A.  The  quality  of  this  approximation  can  be  determined  by  con¬ 
sidering  the  residual  ||Ay,  -  y,0,||.  Applying  (4.5.2)  one  finds  that,  to  within  roundoff, 

II Ay*  -  y/0,11  <  pj,  +  II Ej  (4.5.3) 

where  E,  is  an  error  matrix  and  pJt  s  0/+,|s/f|  and  sJt  s  e/a,,  i.e.,  the  bottom  element  of 
the  corresponding  eigenvector  of  Tj.  The  play  an  important  role  in  the  process  of  selec¬ 
tive  orthogonalization. 


A 
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Now  everything  is  ready  to  state  one  of  the  most  important  consequences  of  Paige’s 
work,  which  can  be  summarized  as  follows: 

.  Loss  of  orthogonality  among  the  Lanczos  vectors  is  equivalent  to  the  convergence  of 
a  Ritz  pair. 

In  other  words,  if  one  of  the  0 y,  becomes  small,  the  corresponding  Ritz  pair  converges  to 
an  eigenpair  of  A  and  the  Lanczos  vector  qy+i  loses  its  orthogonality  to  qi,  •  •  ■  But 
more  is  known  about  the  way  qy+1  behaves,  it  is  tilted  towards  yjJ)  while  it  is  retaining  its 
previous  level  of  orthogonality  to  all  the  other  Ritz  vectors,  y^\k  *  /'.  In  order  to  main¬ 
tain  a  certain  level  of  orthogonality  it  is  therefore  only  necessary  to  orlhogonalize  the  new 
qy+1,  or  equivalently  the  unnormalized  r,  against  y/^  when  the  0y,  become  smaller  than 
some  threshold.  So  first  compute  r'y  (compare  4  1.3)  by 

r'y  S  Aqy  -  qyay  -  q,-i/3,  (4.5.4) 

as  usual,  then  check  if  any  /3y,  is  small.  If  so  then  compute  the  corresponding  ylJ)  and 
orthogonalize  r'y  against  it  obtaining 

r,  =  t'j  -  jMP  (4.5.5) 

where 


_  y^r'j 

-  Ib',0)ll2 


(4.5.6) 


This  requires  the  formation  of  y,  -  Qys,  The  payofF  for  the  expensive  calculation  comes 
later  when  subsequent  Lanczos  vectors,  say  qy+u  and  qy+jo  need  to  be  orthogonalized 
against  y,,  since  a  certain  number  p  of  Ritz  vectors  are  also  put  in  secondary  storage  and 
need  not  be  formed  again.  The  question  of  how  many  Ritz  vectors  to  keep,  and  whether  to 
keep  them  in  core  in  case  the  optimal  number  is  small  is  still  under  investigation. 

Finally  it  should  be  mentioned  that  the  threshold  for  which  a  /9y,  is  considered  to  be 
small  is  set  to  be  •%/«  JlTy ||  in  the  present  implementation.  This  choice  was  based  on  the 
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computational  experience  for  the  eigenvalue  problem  (20).  An  analysis  in  (20)  shows  that 
this  guarantees  a  certain  level  of  orthogonality  among  the  Lanczos  vectors. 

To  demonstrate  the  loss  of  orthogonality,  in  Figure  4.2,  the  variation  in  the  residual  norm 
is  plotted  for  the  simple  Lanczos  process  when  no  action  is  taken  to  maintain  orthogonality 
and  for  the  case  when  selective  orthogonalization  is  performed  to  maintain  a  level  of  ortho¬ 
gonality  among  the  Lanczos  vectors.  The  conjugate  gradient  method  and  other  methods 
based  on  the  Krylov  vectors  also  suffer  this  effect.  It  should  be  noted  that  the  loss  of  ortho¬ 
gonality  does  not  stop  convergence,  it  only  delays  it. 

4.6.  Lanczos  with  Selective  Orthogonalization 

The  final  form  of  the  Lanczos  algorithm  which  maintains  a  certain  level  of  ortho¬ 
gonality  is  summarized  in  the  following  table. 

1.  Loop:  for  j  -  1,2,  •  •  • 

1.1  Take  a  simple  Lanczos  step 

1.2  Update  residual  norm  pj 

1.3  Check:  if  p,  <  tolerance  end  the  loop 

1 .4  Update  the  eigenvalues  of  T t 

1.5  Compute  /3y,  -  /3;+lsy, 

1.6  Check:  if  /3„  <  >/7||Tj 

then  compute  the  »'-th  eigen  vector  and  orthogonalize 

2.  Solve  T;fy  -  ei/3| 

3.  Assemble  solution:  Xj  -  Qjtj 

The  above  formulation  requires  a  total  of  5  n -vectors  and  10  y- vectors  workspace. 
This  includes  two  -vectors  containing  the  right  hand  side  and  th,  initial  guess,  two  />• 
vectors  for  tj  and  q;  and  a  n -vector  for  use  in  construction  of  Ritz  vectors. 
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4.7.  Selective  Reorlbogonalization 

The  procedure  described  in  the  previous  section  is  not  the  only  way  of  maintaining 
orthogonality  among  the  Lanczos  vectors.  Recently  H.  Simon  [22]  has  proposed  an  alterna¬ 
tive  technique  based  on  a  simple  recursion  formula.  This  recursion  is  derived  as  follows. 
Define 

w,j  “  q,rQy  (4  7.1) 

and  multiplying  equation  4.1.3  by  qtrand  using  the  above  definition  we  get 

q/Aqy  -  ajo>,j  +  Pjto.j-)  +  0y+i»,.y+ 1  (4.7.1.a) 

) 

a  similar  equation  can  be  obtained  when  the  same  steps  are  applied  to  the  Mh  Lanczos 
step. 

4yrAq,  -  «*/«/,;  +  fiiVjj- i  +  Pz+iw/.i+i  (4.7.1. b) 

subtracting  the  above  two  relations  to  eliminate  the  terms  involving  A  results  in  the  recur¬ 
sion 

Pj+iuj+\.i  -  0/+i«y,/+i  +  (<*/  ~  <*/)»>./  +  1  -  Pjotj,)- 1  (4.7.3) 

This  recursion  holds  for  j  -  2,3,  •  •  •  and  K  i  <  j- 1.  To  start  the  three  step  recursion  it 
is  necessary  to  assume  that 

wjj-l  j  -  1,2,  •  •  •  (4.7.4.a) 

and 

- «  y  -  2,3,  •  •  •  (4.7.4.b) 

Using  the  above  recursion  the  loss  of  orthogonality  can  be  monitored  during  the  Lanczos 
process  simply  by  looking  at  the  magnitude  of  Whenever  |<tty,/|  is  larger  than  the 
threshold  value,  >/<||Ty||,  orthogonality  is  considered  to  be  lost  against  q,  and  reorthogonal- 
ization  must  be  performed.  Lanczos  algorithm  with  selective  reorthogonalization  then 


becomes: 
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1.  Loop:  for  j  -  1,2,  ■  •  • 

1.1  Take  a  simple  Lanczos  step 

1.2  Update  residual  norm  pj 

1.3  Check:  if  p,  <  tolerance  end  the  loop 

1.4  Update  oi,  y 

1.5  Check:  if  <utJ  >  VtjjT y|| 

then  orthogonalize  q;+i  against  previous  Lanczos  vectors. 

2.  Solve  Tjfj  -  ei/l| 

3.  Assemble  solution.  \j  -  Q;f, 

This  algorithm  has  a  number  of  desirable  advantages  over  the  previous  approach. 

(i)  The  implementation  of  this  recursion  is  very  easy  and  the  resulting  code  is  consider¬ 
ably  simpler. 

(ii)  This  approach  requires  less  storage  space. 

(iii)  There  is  no  need  for  computing  the  eigenvalues  and  eigenvectors  of  the  tridiagonal 
system. 

Initial  experiments  with  the  two  approaches  indicate  that  both  methods  flag  the  loss  of 
orthogonality  at  the  same  step.  Therefore  due  to  the  above  advantages  the  selective 
reorthogonalization  technique  is  a  better  way  of  maintaining  orthogonality  for  purpose  of 
equation  solving. 

4.8.  Starting  Vector 

The  choice  of  a  good  starting  vector  can  reduce  the  number  of  Lanczos  steps  the  algo¬ 
rithm  takes  to  converge  to  the  solution.  In  fact  if  the  true  solution  is  on  hand  the  Lanczos 
algorithm  terminates  after  one  step.  However  it  is  not  often  that  one  has  the  exact  solution 
or  even  a  good  approximation.  Nevertheless  the  question  of  starting  vectors  still  remains. 
Consider  the  extreme  case  where  the  right  hand  side  is  an  eigenvector.  Then  any  starting 
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vector  other  than  zero  o:  the  right  hand  side  will  take  more  than  one  step  to  converge, 
whereas,  a  zero  starting  vector  will  take  exactly  one  step  to  obtain  the  true  solution.  This 
can  be  seen  by  performing  the  algorithm  on  the  equation  Ax  —  z,,  where  z,  is  the  i-th 
eigenvector  with  correstxmdmg  eigenvalue  A,  Then 

Setting  qt.  -  0 
ro  -  z,  and  B.  1 

<1 )  q,  —  r c/B‘  -  *, 

(2)  u,  —  Aq,  -  A  ,z; 

(3)  r |  —  u;  -  B  i« o  ^  a  |Z| 

(4)  a  |  -■  <|  /r ;  -  \  ■ 

f  b  *  r ;  * —  r  i  ■  <r;q  —  u 

( 6 1  fi  2  *■  0 

and  (he  residual  norm  wili  oe  zero. 

In  general  the  right  hand  side  has  components  along  a  certain  number  of  eigenvectors 
and  introduction  ol  any  staging  vector  that  has  components  along  any  additional  eigenvec¬ 
tors  will  delay  convergence  Theiefo’-e,  u  can  be  said  that  unless  special  knowledge  about 
the  solution  is  avaiiah'  :.  the  best  starting  vector  is  the  zero  vector.  The  influence  of  a  ran¬ 
dom  staring  sector  o>-  !»»••  loidnal  norm  during  a  Lanczos  run  can  be  seen  in  Figure  4.3  for 
the  simple  beam  i"  ■'’•ati 


Potential  Energy 


Figure  4. 1  Monotonic  Reduction  of  the  Potential  Ei 


of  Selective  Orthogonalization  on  the  Reduction 


Vectors  on  the  Reduction 
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5.  NEWTON-LANCZOS  METHOD 

Generally  in  nonlinear  structural  analysis,  the  solution  to  the  nonlinear  equations  is 
required  at  a  number  of  load  levels  to  generate  a  complete  equilibrium  history.  The  BPGS 
method  described  in  section  3.3  attempts  to  make  use  of  the  information  at  a  previous 
iterate,  by  updating  the  factorization  of  the  tangent  matrix,  to  find  the  next  solution.  This 
updating  procedure  imposes  a  limitation  on  the  solution  algorithm.  That  is: 

The  BFGS  can  not  be  applied  to  nonlinear  problems  where  the  tangent  matrix 
becomes  indefinite  during  the  iteration  process. 

The  object  of  the  BFGS  algorithm  was  to  reduce  the  overall  number  of  matrix  factoriza¬ 
tions  but  this  could  only  be  achieved  at  the  cost  of  limiting  pie  class  of  problems  for  which 
the  method  can  be  useful 

In  addition  to  the  short  comings  of  Newton  type  methods  described  in  section  3,  all 
these  methods  including  Newton's  method  have  the  following  limitations. 

(1 )  These  methods  are  not  globally  convergent,  i.e  not  all  starting  vectors  converge  to  the 
solution. 

(2)  The  algorithm  can  not  be  applied  to  the  cases  when  the  tangent  matrix  becomes 
singular  at  some  xk. 

Here  global  convergence  means  that  if  f(x)  -  0  has  a  unique  solution  then  all  starting  vec¬ 
tors  converge  to  this  solution 
The  object  of  the  proposed  algorithm  is 

(i)  Remove  the  above  two  restrictions. 

(u)  Reduce  the  number  of  matrix  factorization. 

VI.  Description  of  the  Algorithm 

The  Newton-Lanczos  algorithm  is  based  on  the  simple  idea  that  so  long  as  the  approx- 
imate  solution  is  far  from  the  exact  solution,  it  is  not  necessary  to  solve  the  linearized 


problem  exactly,  but  only  a  modest  level  of  accuracy  is  sufficient. 

The  algorithm  consists  of  choosing  an  initial  guess  xq  and  repeating  the  following  steps  for 


k  *■  0,1,2,  •  •  ■  ,  until  convergence  is  achieved: 

(1)  given  x*  compute  f(x*) 

(2)  compute  the  tangent  matrix  A* 

,,,  c  L  .  llA*d*  +  f(x*)|| 

(3)  find  some  d*  such  that - -| -  <  tj* 

(4)  compute  sk  from  a  line  search 

(5)  update  x*+,  -  x*  +  s*d* 

(6)  test  for  convergence  and  terminate  if  converged 

Here  tj*  is  some  parameter  such  that  tj*  <  1  The  only  difference  between  this  algorithm 
and  Newton's  method  is  in  step  3.  At  this  step  the  Lanczos  algorithm  as  described  in  sec¬ 
tion  4  is  used  to  obtain  d*  and  t)*  is  the  specified  solution  tolerance  for  the  Lanczos  algo¬ 
rithm. 

It  is  evident  from  the  description  of  the  algorithm  that  if  A*  is  singular  at  some  step 
k  then  the  process  need  not  be  terminated  as  long  as  a  d*  can  be  found  that  can  achieve 
the  required  tolerance.  Under  this  circumstance  it  is  desirable  to  have  small  components  of 
the  eigenvectors  corresponding  to  the  zero  eigenvalues  along  dt.  The  behavior  of  the 
Lanczos  algorithm  is  such  that  the  process  takes  close  to  n  steps,  where  n  is  the  number  of 
unknowns,  before  it  introduces  any  component  of  these  eigenvectors  into  the  solution  vec¬ 
tor.  The  approximate  solution  is  obtained  from  a  linear  combinations  of  the  Krylov  vectors 
and  it  is  the  matrix-vector  multiplications  that  annihilates  components  of  the  eigenvectors 
corresponding  to  zero  eigenvalue.  Therefore,  it  is  possible  to  obtain  an  approximate  solu¬ 
tion  without  introducing  large  components  of  the  undesirable  eigenvectors. 

The  cost  of  step  3  can  be  reduced  by  obtaining  a  good  initial  guess  to  the  solution  of 
the  new  system.  This  can  be  done  by  solving  the  weak  form  of  the  linearized  equation  at  a 


i 


previous  step  i. 


T«'>fy  -  h, 

with  the  appropriate  right  hand  side.  The  projection  of  the  residual  vector,  f(x*),  on  the 
Lanczos  vectors  of  the  previous  step  is  identically  this  right  hand  side.  That  is 

h,  -  Q}‘)Tt(xk ) 

5.2.  Convergence  of  Newton-La nczos 

Newton-Lanczos  algorithm  has  very  attractive  convergence  properties.  The  solution 
part  of  the  algorithm  can  be  rearranged  to  get 

l|Akd*  +f(x*)||  <  T,Jf(x*)||  (5.2.1) 

The  convergence  of  Newton-Lanczos  method  can  be  proved  provided  the  following 
assumptions  hold. 

(a)  The  solution  x '  to  f(x ")  is  unique. 

(b)  The  Jacobian  matrix  exists. 

(c)  f(x ")  is  nonsingular. 

The  Taylor  expansion  about  x*  can  be  written  as 

f(x*  +  ,)  -  f(x*)  +  A*d*  +  e(d*)  (5.2.2) 

where  e(d*)  includes  all  the  higher  order  terms  in  d*.  Hence 

||f(x*+1)||  <  ||f(x*)  +  A*d*||  +  ||e(d*)||  (5.2.3) 

Using  equation  5.2.1  this  can  be  modified  to 

||f(x*+1)||  <  r,*i|f(xt)||  +  ||e(d*)||  (5.2.4) 

At  this  stage  there  are  two  possible  cases  that  need  be  considered: 

(I)  r)Jf<x*)||  <  l!e(d*|| 

This  implies  that  |jf(x*+1)!|  <  2||e(d*)||  -  CHlIdJI2)  which  in  turn  implies  at  least 
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superlinear  convergence. 

(2)  |e(4*)||  <  tfh  ||f(x*)  || 

This  inequality  can  be  stated  as  an  equality  of  the  form 

lle(4*)||  -  (5.2.5) 

where  t  is  a  positive  scalar  such  that  /  <  1,  then  equation  S.2.4  can  be  written  as 

l|f(**+»ll  <  y||f(x*>ll  (5.2.6) 

where  y  -  (i  +  D^mn,  is  *  scalar.  Choosing  rj^  such  that  17*  <  17™*  <  y  <  1  for 
all  values  of  k  Then 

^limrtx*)  -  0  (5.2.7) 


These  results  show  that  the  algorithm  is  at  least  linearly  convergent, i.e 

ll**+i  -  *’ll  <  «!**  ~  * *11  a  <  1  (5.2.8) 

It  can  also  be  observed  directly  from  the  algorithm  that  by  setting  17*  -  e,  the  computer 
precision,  the  algorithm  will  achieve  a  quadratic  rate  of  convergence. 

In  general  the  rate  of  convergence  is  superlinear. 


Il**+i  -  *  II  ^  a||x*  -  x 


(5.2.9) 


where  1  <4(17*)  <  2  and  lim  4(17)  —  2.  The  superlinear  rate  of  convergence  can  be 

'1-0 

guaranteed  by  choosing  tj*  such  that  17*  — •  0  as  xk  — »  x*.  This  can  be  achieved  very  simply 

,  v  ,  ,  »f(**)ll 

by  making  17 *  dependent  on 
chosen  as 


.  In  the  present  implementation  this  dependence  was 


v  k-V 

based  on  some  computational  experiments. 


I 


Ilf  (x*)  II 

Ilf  (*o)  H 
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S.3.  Preconditioning  of  The  Linear  Equations 


The  use  of  preconditioning  of  the  linear  system  of  equations  can  often  speed  up  con¬ 
vergence  of  the  Lanczos  algorithm  considerably.  The  number  of  Lanczos  steps  required  to 
reduce  the  residual  norm  by  a  factor  17  depends,  in  general,  on  the  distribution  of  the 
eigenvalues  of  A.  An  upper  bound,  p,  to  this  number  can  be  obtained  from  following  rela¬ 
tion 


Tk-  1 


Vk  +  1 


(5.3.1) 


where  *  is  the  condition  number  of  A,  *  -  \J\\.  This  upper  bound  is  too  crude  to  have 
any  practical  use  for  estimating  the  number  of  Lanczos  steps  but  it  can  be  seen  that  when 
the  condition  number  is  unity  Lanczos  algorithm  requires  only  one  step  to  obtain  the  exact 
solution.  The  equation  4.1.1  can  now  be  modified  to 

C-lAC-rx  —  f 


where 


and 


x  «  C'x 


r.c 


(5.3.2) 


(5.3.3) 


f  -  C  'r0 


(5.3.4) 


where  C  is  any  matrix  with  an  inverse  that  can  be  computed  easily  such  as  a  triangular 
matrix.  The  object  now  is  to  solve  5.3.2  .  The  number  of  Lanczos  steps  can  be  reduced  sub¬ 
stantially  by  choosing  C  in  such  a  way  as  to  reduce  the  condition  number  of  the  product 
matrix  of  equation  5.3.2 

The  present  implementation  of  the  Newton- Lanczos  algorithm  uses  the  Cholesky  fac¬ 
tor  of  the  tangent  matrix  at  some  previous  step  for  C.  In  this  way  the  information  is  passed 
on  from  on  iterate  to  the  next  and  therefore  reduces  the  number  of  factorizations.  How¬ 
ever,  factorizations  can  be  eliminated  completely  if  a  good  approximation  to  the  precondi¬ 
tioning  matrix  C  can  be  obtained.  Moreover,  the  preconditioning  car  be  updated  from  one 
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step  to  the  next  using  an  updating  procedure  simiier  to  the  one  used  for  the  BFGS  algo¬ 
rithm. 

5.4.  Numerical  Results 

Example  / 

This  example,  figure  5.1,  is  a  simple  truss  structure  with  a  peculiar  property.  The  ini¬ 
tial  tangent  matrix  is  singular.  The  purpose  of  this  example  is  to  demonstrate  the  global 
convergence  of  the  Newton-Lanczos  algorithm.  A  linearly  elastic,  large  displacement  truss 
element  was  used  to  model  the  structure  which  has  a  total  of  8  degrees  of  freedom. 

Using  a  zero  starting  solution,  all  the  methods  described  in  section  3,  including 
Newton’s  method,  diverged  immediately  after  the  initial  iteration  when  applied  to  this  prob¬ 
lem.  This  is  due  to  the  fact  that  a  direct  equation  solver  is  used  to  solve  the  linearized 
problem  and  since  the  tangent  matrix  is  singular  the  solution  vector  will  be  an  eigenvector 
corresponding  to  a  zero  eigenvalue  with  a  large  magnitude.  This  is  like  performing  one  step 
of  inverse  iteration  method  with  a  singular  matrix  which  always  converge  to  the  eigenvector 
with  smallest  eigenvalue.  No  line  search  was  used  for  Newton's  method  or  Modified  New¬ 
ton  method,  however  the  BFGS  algorithm  which  has  line  search  also  failed  to  converge. 

Newton-Lanczos  algorithm,  with  no  preconditioning,  when  applied  to  this  problem 
converged  to  the  correct  solution.  However  the  convergence  was  only  linear  because  of  the 
singularity.  The  convergence  rate  improved  as  the  approximate  solution  moved  away  from 
the  singularity.  Despite  the  low  rate  of  convergence,  the  fact  that  the  correct  solution  was 
obtained  shows  that  the  domain  of  attraction  of  Newton-Lanczos  is  larger  than  that  of 
Newton's  method.  The  load-displacement  curve  of  figure  5.2  gives  an  indication  of  the 
stiffening  behavior  of  the  structure. 


Example  II 


mm m 


an 


45 

For  a  second  example,  we  consider  the  shear  loading  of  a  block  of  rubber,  as  shown 
in  Figure  5.3.  The  nonlinear  elastic  behavior  of  the  rubber  was  modeled  using  a 
simplification  of  the  general  Rivlin  model,  originally  proposed  by  Mooney  (4).  The  non- 
linearities  are  present  throughout  the  loading  process.  A  nine  node  plane  stress  Lagrangian 
element,  with  penalty  formulation  (21],  was  used  to  discretize  the  problem  to  a  total  of  60 
degrees  of  freedom. 

The  four  basic  methods  were  tested  on  this  problem  and  the  results  are  tabulated  in 
Table  5.1.  The  convergence  of  the  BFGS  quasi-Newton  method  was  disappointing  since  it 
required  1 1  iterations  to  converge  at  each  step  while  the  modified  Newton  which  has  only  a 
linear  convergence  rate,  required  14  iteration  to  converge. 

The  total  number  of  factorizations  required  by  each  method  is  listed  below. 


Newton  9 

BFGS  3 

Modified  Newton  3 

Newton- Lanczos  1 


The  control  of  the  rate  of  convergence  for  Newlon-Lnaczos  can  be  seen.  Figure  5.4, 
when  the  method  was  used,  without  any  preconditioning,  to  solve  this  problem  with 
different  no  values. 

Example  III 

As  a  final  example  a  parabolic  beam  illustrated  in  Figure  5.5  was  modeled  using  4 
node,  plane  stress,  quadrilateral  elements.  The  nonlinearities  enter  the  problem  when  the 
beam  comes  into  contact  with  a  rigid  boundary.  The  contact  condition  is  enforced  by 
Lagrange  multiplier  constraints  on  those  nodes  indicating  penetration  and/or  compressive 
contact  force.  The  contact  problem  is  of  the  same  type  as  the  problem  of  constrained 
optimization. 
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Both  The  BFGS  and  the  Modified  Newton  method  failed  to  converge  when  applied  to 
this  problem.  This  is  because  the  tangent  matrix  becomes  indefinite  during  the  iteration 
process  and  the  BFGS  updates  are  incapable  of  accounting  for  the  indefiniteness  introduced 
by  the  contact  elements. 

The  structure  has  stiffening  characteristic  during  the  loading  process  and  a  softening 
characteristic  during  the  unloading  phase.  The  total  number  of  degrees  of  freedom  is  870. 
The  cost  comparisons  for  Newton’s  method  and  Newton-Lanczos  method  are  tabulated  in 
Table  S.2.  The  total  number  of  factorization  required  by  each  method  is  as  follows: 

Newton  13 

Newton-Lanczos  2 

5.5.  Closure 

The  motivation  for  the  development  of  quasi-Newton  methods  was  reduction  in  the 
overall  cost  of  the  analysis,  using  fewer  factorization  steps.  However,  the  use  of  quasi- 
Newton  methods  to  solve  nonlinear  finite  element  equations  has  been  restricted  to  prob¬ 
lems  with  positive  definite  Jacobians. 

The  Newton-Lanczos  algorithm  described  herein  not  only  requires  less  factorization 
steps  than  either  the  quasi-Newton  method  or  the  modified  Newton  method  but  also  has 
the  following  advantages  over  its  rivals: 

( 1 )  The  method  is  globally  convergent. 

(2)  The  rate  of  convergence  can  be  controlled  through  a  specified  tolerance. 

(3)  The  algorithm  does  not  terminate  in  the  case  of  a  near  singular  Jacobian  matrix. 

(4)  The  algorithm  can  be  applied  to  indefinite  systems. 

In  the  present  implementation  of  Newton-Lanczos,  the  Jacobian  matrix  is  decom¬ 
posed  into  its  Cholesky  factors  in  order  to  get  a  preconditioning  matrix  for  the  Lanczos 
algorithm.  This  factorization  is  avoided  if  an  effective  preconditioning  matrix  can  be 


J 


similar  to  the  one  used  by  BFGS  algorithm,  for  use  in 


Newton-Lanczos  alforithm. 


Plane  Stress,  Linear  Elastic  Elements 
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APPENDIX  A 


Orthogonalization  of  the  Krylov  Vectors 

Consider  the  set  of  vectors  (r^Aro  A^o,  ■  -  -  ,Ay_lro).  Assume  for  the  moment  that  a 
set  of  orthonormal  vectors,  (41,42. ' ' '  ,4 j),  are  produced  by  successivly  orthogonalizing 
A'r0  to  all  the  previous  vectors,  all  vectors  A*r0  for  k  <  /,  and  then  normalizing  the 
resulting  vector  to  obtain  the  next  orthonormal  vector  qy+|.  Repeating  this  procedure  for 
/  -  1,2,  •  •  •  J—  1  defines  the  Gram-Schmidt  method. 

At  some  step  /  <  j-l  we  have 

i 

*,  “  A'r0-  £y*q*  (Al) 

*-1 


where  y*  -  q,rA'r0  and  4,+, 


~-r  but  A '"Vo*  span  (qi,qh  •  •  •  ,4,).  This  can  be  rewrit- 
|r,| 


ten  as 


A'_lr0  —  2^i»*4* 
*-1 


(A2) 


where  vk  “  q*A'  *ro  Equations  Al  and  A2  can  now  be  combined  to  give 

I  i 

r,  -  InA(»  -  £y*q* 

t-i  *-) 


(A3) 


and  in  turn  Aq,_t  c  span  (41,42,  -  -  -  ,4,).  Therefore 

i 

r,  “  Aq,  -  £y*q*  (A4) 

*-1 

where  y*  -  q*rAq,  and  now  4,+,  - 

Hr,  II 

At  some  step  j,  the  orthonormalization  proccess  described  by  equation  A4  must  be 
performed  for  the  vector  Aqy  However,  consider  the  case  when  we  orthogonalize  this  vec¬ 
tor  only  to  the  previous  two  vectors,  q,  and  q;_|.  Then  we  get 
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tj  -  Agy  -  a/%j  -  (A5) 

where  ay  —  f/Afy  and  Pj  —  The  orthogonality  of  to  the  remaining  vectors  can 

now  be  checked  by  forming  g/ry  for  /  <  j  -  1 

«/fy  -  g<r  (Agj  -  ajqj  -  fift-i) 

-  g,rAqy  (A6) 

“  g/A  g, 

This  is  derived  using  the  orthogonality  condition  of  the  q  vectors  and  the  symmetry  pro¬ 
perty  of  A.  This  eqution  together  with  equation  A4  leads  to  the  following: 

g,ry  -  g/(|r,|g,+i  +  £y*g*)  -  0 

*-i 

This  results  shows  that  tj  is  orthogonal  to  all  the  previous  vectors  g,  i  -  1,2,  •  •  •  J. 
Therefore  to  generate  the  orthonormal  bases,  at  each  step  Aq;  need  only  be  orthogonalized 
aqainst  q;  and  qy_|- 
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APPENDIX  B 

Minimal  Potantial  Energy  over  Krylov  Space 

The  potential  energy  of  a  linear  system  is  given  by 

£"(*)  -  >/ixrAx  -  xrb  (Bl) 

where  the  £*  denotes  the  energy  in  the  n-dimentional  space.  The  minimum  of  this  scalar 

function  occurs  when 

V£*(x)  -  A*  -  b  -  0  (B2) 

When  the  variables  are  restricted  to  the  space  of  Lanczos  vectors  they  are  redefined  as 

and 

h  *  PA\  “  0iQ/Oi 

The  potantial  energy  now  becomes 

£y(fy)  -  VitJdlkQjtj  - 

-  tJtjtj  -  (B3) 

The  minimum  of  the  potantial  energy,  restricted  to  the  Krylov  space,  now  occurs  when 

V£'(fy)  -  Tyfy  -  -  0  (B4) 
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